{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Downloading and registering \"catalogue\" datasets\n",
    "\n",
    "The geoslurp tools come with a set of \"standard\" datasets which are ready to be downloaded and used in the database. When  running tasks using the [geoslurper.py command line tool](../examples/usingcli.html#using-the-command-line-tool-geoslurp), it will consult a [cached catalogue](../reference/geoslurp.config.html#geoslurp.config.catalogue.DatasetCatalogue) with dataset classes. Whenever a dataset class is needed it will be instantiated, so the *pull* and *register* routines can be called on it.\n",
    "\n",
    "Although this is covered by the functionality of the *geoslurper.py* script, the catalogue can also be consulted directly in other python scripts. \n",
    "\n",
    "The following example demonstrates how the catalogue is used to download and register a dataset from the [natural earth collection](https://www.naturalearthdata.com).\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 6,
   "metadata": {},
   "outputs": [],
   "source": [
    "from geoslurp.config.catalogue import geoslurpCatalogue\n",
    "from geoslurp.db import Settings\n",
    "from geoslurp.config import setInfoLevel\n",
    "\n",
    "from geoslurp.db import geoslurpConnect\n",
    "\n",
    "setInfoLevel()\n",
    "\n",
    "gpcon=geoslurpConnect(readonlyuser=False) # this will be a connection based on the readonly userfrom geoslurp.db.geo\n",
    "\n",
    "#Some datasets need info from the server side settings so we need to load these\n",
    "conf=Settings(gpcon)\n",
    "\n",
    "#refresh catalogue (note this only needs to be done when a catalogue exists already but new classes have been added to the paths)\n",
    "# geoslurpCatalogue.refresh(conf)\n",
    "\n",
    "\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once the catalogue is loaded, we can query for dataset classes using regular expressions. The catalogue will return classes, which still need to be instantiated in order to be useful. \n",
    "\n",
    "As a remark: Note that the following operation could also have been achieved with the following shell command:\n",
    "```\n",
    "geoslurper.py --pull --register -d \"globalgis.ne_110m_admin_1_states_provinces.\\*\"\n",
    "```"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "<class 'abc.ne_110m_admin_1_states_provinces'>\n",
      "<class 'abc.ne_110m_admin_1_states_provinces_lakes'>\n",
      "<class 'abc.ne_110m_admin_1_states_provinces_lines'>\n",
      "<class 'abc.ne_110m_admin_1_states_provinces_scale_rank'>\n"
     ]
    }
   ],
   "source": [
    "# find all datasetclasses which obey a certain regex (needs to match the entire string)\n",
    "for dsclass in geoslurpCatalogue.getDatasets(conf,\"globalgis\\.ne_110m_admin_1_states_provinces.*\"):\n",
    "    # create an instance of the class\n",
    "    dsobject=dsclass(gpcon)\n",
    "    dsobject.pull()\n",
    "    dsobject.register()\n"
   ]
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "pyrr",
   "language": "python",
   "name": "pyrr"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.8.1"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}
